6 - Clustering

Author

CDN team

Published

Last compiled on 7 June, 2024

Clustering

Introduction

Clustering takes place on good quality cells after dimensionality reduction. After filtering out non-informative cells and reducing the dimensionality of our dataset, the data is ready to cluster. Clustering is the process of grouping cell identities based on their gene expression profiles. It is done by computing a nearest neighbor graph and then grouping cells based on their connectivity in this graph. To effectively cluster an experiment, it should be done iteratively so the least amount of information is lost or misidentified.

In this notebook, we learn:

  • Clustering in Seurat requires the computation of a nearest neighbor graph followed by the identification of clusters in this graph using functions FindNeighbors() and FindClusters().

  • The resolution parameter in FindClusters() controls the granularity of the clustering. A higher resolution results in more clusters.

  • Clustering results can be evaluated quantitatively by looking at the sample and batch distribution across clusters and a silhouette analysis of the clusters.

Review

We left off (last session) filtering out the poor quality data without regard for its distribution. Last week, we learned how principle components are calculated, what a latent space is, and defined a kNN (k-nearest neighbors) graph.

To review those:

We take a cell x gene matrix, normalize it, and reduce its dimensionality using PCA. We looked at an Elbow plot to determine the best number of PCs that accurately show the variance explained. After selecting the top 30 PCs, we generated a k-nearest neighbors (kNN) graph to represent the relationships between cells based on their gene expression profiles and ran FindClusters() to identify clusters of cells based on their neighborhood relationships. Kyle introduced Harmony as an integration technique to correct for batch effects and I just went over QC metrics including doublet detection that help us understand the quality and content of our data.

Glossary

PCA - A dimensionality reduction technique that reduces the number of dimensions in a dataset while retaining as much variance as possible. The first principal component accounts for the most variance in the data, and each subsequent component accounts for less variance.

Latent Space - The latent space is the low-dimensional representation of the data that captures the most important features.

kNN Graph - A graph that represents the relationships between cells based on their gene expression profiles. Each cell is connected to its k (1, 20, 100) nearest neighbors in the graph.

Resources

  1. Current best practices in single-cell RNA-seq analysis: a tutorial
  2. Bioconductor
  3. Single Cell Best Practices
  4. Challenges in unsupervised clustering of single-cell RNA-seq data

Load Libraries and Data

if (!requireNamespace("dplyr", quietly = TRUE))
    install.packages('dplyr')
if (!requireNamespace("tidyverse", quietly = TRUE))
    install.packages('tidyverse')
if (!requireNamespace("Seurat", quietly = TRUE))
    install.packages('Seurat')
if (!requireNamespace("colorBlindness", quietly = TRUE))
    install.packages('colorBlindness')
if (!requireNamespace("RColorBrewer", quietly = TRUE))
    install.packages('RColorBrewer')
if (!requireNamespace("cluster", quietly = TRUE))
    install.packages('cluster')
if (!requireNamespace("viridis", quietly = TRUE))
    install.packages('viridis')
if (!requireNamespace("ggplot2", quietly = TRUE))
    install.packages('ggplot2')
if (!requireNamespace("ggalluvial", quietly = TRUE))
    install.packages("ggalluvial") 
library(dplyr)
library(Seurat)
library(tidyverse)
library(RColorBrewer)
library(colorBlindness)
library(DoubletFinder)
library(cluster)
library(viridis)
library(ggplot2)
library(ggalluvial)

set.seed(687)

Load Data

# Load the Seurat object with doublet and batch information
se <- readRDS('../data/Covid_Flu_Seurat_Object_Quality.rds')
se 
An object of class Seurat 
33234 features across 59572 samples within 1 assay 
Active assay: RNA (33234 features, 3000 variable features)
 3 layers present: counts, data, scale.data
 2 dimensional reductions calculated: pca, umap
# Set color palette for cell types
pal <- paletteMartin
names(pal) <- sort(unique(se$Celltype))

donor_pal <- c(
    "#66C2A4", "#41AE76", "#238B45", "#006D2C",
    "#41B6C4", "#1D91C0", "#225EA8", "#253494", "#081D58",
    "#FFEDA0", "#FED976", "#FEB24C", "#f79736", "#FD8D3C",
    "#FC4E2A", "#E31A1C", "#BD0026", "#ad393b", "#800000", "#800050")

names(donor_pal) <- c(
    "Normal 1", "Normal 2", "Normal 3", "Normal 4",
    "Flu 1", "Flu 2", "Flu 3", "Flu 4", "Flu 5",
    "nCoV 1", "nCoV 2", "nCoV 3", "nCoV 4", "nCoV 5",
    "nCoV 6", "nCoV 7", "nCoV 8", "nCoV 9", "nCoV 10", "nCoV 11"  
)

Set Up

PCA

Let’s first look at the data according to the top 2 principle components.

celltype_pca <- DimPlot(
        se,
        reduction = "pca",
        group.by = 'Celltype',
        cols = pal
        ) 

celltype_pca

Construct the kNN graph with FindNeighbors():

FindNeighbors() is the Seurat function that calculates the k-nearest neighbors of each cell in PCA space. The number of neighbors used for this calculation is controlled by the k.param parameter with a default value of 30. It computes pairwaise distances between cells based on their gene expression using comming algorithms such as: euclidean distance, manhattan distance, Pearson correlation distance, and cosine similarity. choice matters.

From BioStars:

FindNeighbors() is a function that is used to find the nearest neighbors of your single cell data point within a dataset. It works by calculating the neighborhood overlap (Jaccard index) between every cell and its k.param nearest neighbors. It’s often employed in various applications such as anomaly detection and dimensionality reduction. The concept is that given a data point, you want to identify the closest data points to it based on some similarity metric, such as Euclidean distance or cosine similarity. This helps to identify similar points in the dataset, which can be useful for making predictions or understanding the distribution of the data.

The default values of FindNeighbors() are:

Let’s modify these to fit our analysis:

se <- se %>% 
    FindNeighbors( 
      object = se,
      reduction = "pca",
      dims = 1:30,
      k.param = 30,
      verbose = FALSE
    )
# Look at the k-nearest neighbors (nn) and shared nearest neighbors (snn) graphs computed
se@graphs
$RNA_nn
A Graph object containing 59572 cells
$RNA_snn
A Graph object containing 59572 cells

FindClusters

FindClusters() is used for identifying clusters of cells based on their neighborhood relationships typically obtained from PCA or t-SNE. It inputs the graph made from FindNeighbors() and outputs cluster assignments for each cell found at se@meta.data$seurat_clusters.

The resolution parameter controls the granularity of the clustering. Higher values of resolution will result in more clusters, while lower values will result in fewer clusters. The default value is 0.8.

From BioStars:
FindClusters() is a function used for clustering data points into groups or clusters based on their similarity. It uses a graph-based clustering approach and a Louvain algorithm. Clustering is an unsupervised learning technique where the algorithm groups similar cells together without any predefined labels. The goal is to find patterns and structure in your data. The number of clusters and the algorithm used can vary based on the problem and data characteristics. Common clustering algorithms include K-means, hierarchical clustering, and DBSCAN.


where ‘algorithm’ =

1 - original Louvain algorithm

2 - Louvain algorithm with multilevel refinement

3 - SLM (Smart Local Moving) algorithm

4 - Leiden algorithm

See ?FindClusters for more information.

Clustering

Clustering is considered a classical unsupervised machine learning task. Seurat uses the community detection algorithm, Leiden, to identify cell clusters. A community detection algorithm identifies groups of nodes in a network that as densely connected. The Leiden algorithm connects cells by finding tightly connected communities in the shared nearest neighbor graph computed in FindNeighbors. Shared nearest neighbor graphs are more robust than k-nearest neighbors graphs. sNN graphs weight edges based on the number of shared edges with other cells. Leiden is a 2019 improvement on the Louvain algorithm so it is common to see both in scRNA-seq analysis.

In clustering, the goal is not to see how many clusters you can pull apart but it is an iterative process. Especially in the first pass, you want to pull apart main cell groups such as epithelial cells and immune cells so you can further refine clusters to extract more granular cell types in the next iteration.

After clustering, we’ll review some cluster validation techniques to qualitatively and quantitatively check the quality of the clustering results.

se <- FindClusters(
      object = se,
      resolution = c(0.01, 0.05, 0.1, 0.15, 0.2,0.25)) 
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 59572
Number of edges: 3484220

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9956
Number of communities: 5
Elapsed time: 11 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 59572
Number of edges: 3484220

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9852
Number of communities: 10
Elapsed time: 12 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 59572
Number of edges: 3484220

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9763
Number of communities: 13
Elapsed time: 13 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 59572
Number of edges: 3484220

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9700
Number of communities: 15
Elapsed time: 14 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 59572
Number of edges: 3484220

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9654
Number of communities: 17
Elapsed time: 13 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 59572
Number of edges: 3484220

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9613
Number of communities: 20
Elapsed time: 14 seconds
# Find clusters based on nearest neighbors at a resolution of 0.5
# Results seen at RNA_snn_res in metadata
se@meta.data %>% select(contains("RNA_snn_res")) %>% colnames()
[1] "RNA_snn_res.0.05" "RNA_snn_res.0.01" "RNA_snn_res.0.1"  "RNA_snn_res.0.15" "RNA_snn_res.0.2"  "RNA_snn_res.0.25"

RunUMAP

UMAP runs on the PCA space. The dims parameter specifies which dimensions of the PCA space to use for the UMAP calculation. The default value is 1:30, which uses all dimensions. The n.components parameter specifies the number of dimensions in the UMAP embedding. The default value is 2.

se <- se %>% 
    RunUMAP(dims = 1:30, verbose = FALSE) # Run UMAP 

Visualize Clusters

DimPlot(
    se,
    group.by = c(
        "RNA_snn_res.0.01", "RNA_snn_res.0.05",
        "RNA_snn_res.0.1", "RNA_snn_res.0.15",
        "RNA_snn_res.0.2", "RNA_snn_res.0.25"),
    label = TRUE
)

Community detection graphs split whole datasets into more groups at higher resolution and do not subcluster based on major clusters that are present at lower resolutions. Clustering at a low resolution splits the dataset into a certain number of groups while clustering at a high resolution draws different lines on that same larger corpus of data. Below we show how this method of clustering (community detection) is different from hierarchical clustering.

se_1 <- se %>%
  FindClusters(resolution = 0.25) 
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 59572
Number of edges: 3484220

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9613
Number of communities: 20
Elapsed time: 15 seconds
metadata <- se@meta.data
metadata_1 <- se_1@meta.data
plot_data <- as.data.frame(table(metadata$seurat_clusters, metadata_1$seurat_clusters))
colnames(plot_data) <- c("Cluster 0.05", "Cluster 0.25", "Freq")
ggplot(plot_data,
       aes(axis1 = `Cluster 0.05`, axis2 = `Cluster 0.25`, y = Freq)) +
    geom_alluvium(aes(fill = `Cluster 0.05`), width = 0.1) +
    geom_stratum(width = 0.1) +
    geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
    scale_x_discrete(limits = c("Cluster 0.05", "Cluster 0.25"), expand = c(0.15, 0.05)) +
    labs(title = "Compare Cluster Resolution of 0.05 and 0.25",
         x = "Clusters",
         y = "Count") +
    theme_minimal()

Different Resolutions

0.01 vs 0.05 Resolution

DimPlot(
    se,
    group.by = c(
        "Celltype","RNA_snn_res.0.05", "RNA_snn_res.0.1"),
    label = TRUE
    )

Comparing the 0.05 and the 0.1 resolutions, Cluster 0 in the 0.05 resolution appears to split into 2 looking at Clusters 1 and 2 in resolution 0.1. While this seems helpful in sifting through the data, we’ve decided to keep the large blobs together. This saves us visibility of our dataset in the second level of annotation. Resolution 0.05 has the best distribution, but let’s quantitatively analyze it’s clusters quality.

Cluster Metrics

Cluster Diversity

Splitting the clusters by cluster size and sample diversity allows us to make sure not one sample is being clustered separately from others.

seurat_clusters <- "RNA_snn_res.0.05"
diversityPerGroup <- function(df, species, group, diversity_metric = 'simpson') {
  # Convert strings to symbols for curly-curly operator
  species_sym <- rlang::sym(species)
  group_sym <- rlang::sym(group)
  # Count groups per species directly using curly-curly
  tierNcount <- df %>%
    group_by({{species_sym}}) %>%
    count({{group_sym}}, name = "n") %>% ungroup
  # Pivot table to allow vegan::diversity call
  tierNwide <- tierNcount %>%
    pivot_wider(names_from = {{group_sym}}, values_from = n, values_fill = list(n = 0))
  # Use rownames of species for the diversity function, which needs a dataframe
  tierNwide_df <- as.data.frame(tierNwide)
  # species column must be first
  tierNwide_df <- tierNwide_df %>% select({{species}}, everything())
  rownames(tierNwide_df) <- tierNwide_df[, 1]
  tierNwide_df <- tierNwide_df[, -1]
  # Calculate diversity
  diversity_values <- vegan::diversity(tierNwide_df, index = diversity_metric)
  # Prepare result as a dataframe
  result <- data.frame(
    species = rownames(tierNwide_df),
    diversity = diversity_values,
    row.names = NULL
  )
  # Rename diversity column
  names(result)[1] <- species
  names(result)[2] <- sprintf('%s_diversity', group)
  return(result)
}

# Calculate simpson's diversity per cluster
clusterMetrics <- diversityPerGroup(se_1@meta.data,
                        species = 'RNA_snn_res.0.05',
                        group = 'Sample ID',
                        diversity_metric = 'simpson')

# Calculate number of cells per cluster and join to metrics table
clusterMetrics <- clusterMetrics %>% left_join(se_1@meta.data %>% count(RNA_snn_res.0.05))

# clusterMetrics
seurat_clusters <- "RNA_snn_res.0.05"
clusterMetrics$seurat_clusters <- as.numeric(clusterMetrics$RNA_snn_res.0.05)

div <- ggplot(clusterMetrics, aes(x = seurat_clusters, y = n)) +
  geom_segment(aes(x = seurat_clusters, xend = seurat_clusters, y = 0, yend = n),
               size = 1.5, color = 'grey80') + # Draw lines for lollipops
  geom_point(aes(colour = `Sample ID_diversity`), size = 5) + # Add colored lollipop 'heads', coloring by 'Sample ID_diversity'
  scale_y_log10() +
  scale_x_continuous(breaks = seq(0,20)) + 
  scale_colour_viridis(option = "C", name = "Sample ID Diversity", direction = 1, limits = c(0,1)) + # Colorblind-friendly, vibrant color palette
  theme_minimal(base_size = 10) +
  theme(legend.position = "bottom",
        axis.text = element_text(size = 12), 
        axis.title = element_text(size = 14), 
        title = element_text(size = 16)) +
  labs(x = "Seurat Clusters",
       y = "cluster size (log-scaled)",
       title = "Cluster Diversity Metrics") +
  guides(colour = guide_colourbar(title.position = "top", title.hjust = 0.5))

div

Cluster 4 appears to be the least diverse in terms of sample diversity. Let’s investigate which samples are in each cluster to see if there is any bias.

Plot sample distribution per cluster

# Assuming se_1@meta.data has columns 'seurat_clusters' and 'Sample ID'

# Prepare the data for plotting
plot_sample <- se_1@meta.data %>%
  count(RNA_snn_res.0.05, `Sample ID`) %>%
  group_by(RNA_snn_res.0.05) %>%
  mutate(proportion = n / sum(n)) %>%
  ungroup()

# Create a stacked bar plot
ggplot(plot_sample, aes(x = factor(RNA_snn_res.0.05), y = proportion, fill = `Sample ID`)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(x = "Seurat Clusters", y = "Proportion", fill = "Sample ID",
       title = "Distribution of Sample IDs Across Clusters") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_fill_manual(values = donor_pal)

Sample Flu 1 appears to make up the majority of Cluster 4.

Another way to visualize cluster diversity is through a stacked bar plot instead of lollipops. This provides a super clear way to compare these numercial metrics side by side.

p2 <- ggplot(clusterMetrics, aes(y=as.character(seurat_clusters), fill=`Sample ID_diversity`, x = 1, label = n)) +
  geom_tile(colour = "white") +
  geom_text(nudge_x = 1.5, size = 3) +
  geom_text(aes(label = signif(`Sample ID_diversity`, 2)),size = 3) +
  scale_fill_distiller(palette = "Spectral", limits = c(0,1)) + theme_classic() +
  coord_fixed(ratio = 0.25) + 
  expand_limits(x = c(0.5,3)) +
  labs(x = "Diversity                                   Size") +
  theme(axis.text.y = element_text(hjust = 1, vjust = 0.5, size = 12),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.title.y = element_text(size = 15),
        legend.key.size = unit(1, 'cm'),
        legend.title = element_text(size=10), 
        legend.text = element_text(size=10)
  ) 
p2

Visualizing cluster diversity based on batch is also an important thing to look out for. This is similar to the QC motivation where these plots provide a way to explain your data and possible findings before claiming any biological phenomena.

# Prepare the data for plotting
plot_batch <- se_1@meta.data %>%
  count(RNA_snn_res.0.05, batch) %>%
  group_by(RNA_snn_res.0.05) %>%
  mutate(proportion = n / sum(n)) %>%
  ungroup()

# Create a stacked bar plot
ggplot(plot_batch, aes(x = factor(RNA_snn_res.0.05), y = proportion, fill = batch)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(x = "Seurat Clusters", y = "Proportion", fill = "Batch",
       title = "Distribution of Batches Across Clusters") +
  # theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_color_manual(values = donor_pal)

Silhouette Analysis

As covered in the last workshop, silhouette analysis is a way to measure how similar an object is to its own cluster compared to other clusters. The silhouette value ranges from -1 to 1, where a high value indicates that the object is well matched to its own cluster and poorly matched to neighboring clusters.

seurat_clusters <- se_1@meta.data$RNA_snn_res.0.05

pca_embeddings <- Embeddings(se, reduction = 'pca')

# Calculate silhouette widths
sil_scores <- silhouette(x = as.numeric(seurat_clusters), dist = dist(pca_embeddings))

# Extract silhouette scores
silhouette_data <- as.data.frame(sil_scores)
# Recover cell type names
silhouette_data$seurat_clusters <- as.character(seurat_clusters)

row.names(silhouette_data) <- row.names(pca_embeddings)

silhouette_arranged <- silhouette_data %>% 
  group_by(seurat_clusters) %>% 
  arrange(-sil_width)
overall_average <- silhouette_arranged %>% 
  ungroup %>% 
  summarize(ave = as.numeric(mean(sil_width))) %>% 
  pull(ave)

full_plot <- ggplot(silhouette_arranged, 
                    aes(x = sil_width, 
                        y = seurat_clusters, 
                        fill = seurat_clusters, 
                        group = seurat_clusters)) +
    geom_bar(stat = "identity", position = 'dodge2') +
    geom_vline(xintercept = overall_average,
               color = 'red',
               linetype = 'longdash') +
    theme_minimal() +
    labs(title = "Silhouette Analysis",
        y = "Cluster",
        x = "Silhouette width",
        fill = "Cluster") +
    theme(axis.text.y = element_text(hjust = 1, vjust = 0.5, size = 20),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.title.y = element_text(size = 20),
        legend.position = "None")

full_plot

The dashed red lines shows the average silhouette width across all clusters. The silhouette width is a measure of how similar an object is to its own cluster compared to other clusters. A high value indicates that the object is well matched to its own cluster and poorly matched to neighboring clusters. The mystery of Cluster 4 is shown a bit more clearly here as the plot shows that Cluster 4 is more similar to other clusters than to itself.

So let’s look at silhouette scores on a UMAP

d5 <- DimPlot(se,
        reduction='umap',
        group.by='RNA_snn_res.0.05', 
        label = TRUE)

se_1$CellID <- row.names(se_1@meta.data)

sil_ids <- silhouette_data %>% rownames_to_column('CellID') %>% left_join(se_1@meta.data, by='CellID')
se_1 <- AddMetaData(se_1, sil_ids)

FeaturePlot(
  se_1, 
  feature = "sil_width") + 
  ggtitle('Silhouette width') + 
  scale_color_viridis(limits = c(-1,1), 
                        option = "magma") | d5

Lastly, we should look at cells that might be between cell types or proliferating. These are often cells in transition and can be clustered across the UMAP.

p1 <- FeaturePlot(
  se,
  features = c("MS4A1")
)
p2 <- DimPlot(
    se,
    group.by = "RNA_snn_res.0.05",
    label = TRUE
  )
p1 | p2

Extra: More on Clustering Algorithms

The Louvain algorithm was developed in 2008 and is a popular community detection algorithm used in scRNA-seq. It recursively merges communities into a single node and executes the modularity clustering on the condensed graphs.(Zhang) Both Seurat and scanpy use Louvain as the default clustering algorithm.


Leiden Algorithm

The Leiden algorithm was published in 2020 as an improvement of the Louvain algorithm. Leiden creates clusters by taking into account the number of links between cells in a cluster versus the overall expected number of links in the dataset. It computes a clustering on a KNN graph obtained from the PC reduced expression space. It starts with an initial partition where each node from its own community. Next, the algorithm moves single nodes from one community to another to find a partition, which is then refined. Based on a refined partition an aggregate network is generated, which is again refined until no further improvements can be obtained, and the final partition is reached. (Single Cell Best Practices).

There are a couple of popular clustering algorithms. There is no one way to cluster as clustering is a means of looking at the data from different angles. The most popular clustering algortihms are the louvain algorithm, leiden algorithm, hierarchical clustering, and k-means clustering. Seurat uses the Leiden algorithm by default which is an improvement on the Louvain algorithm.

Session Info

sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.5

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ggalluvial_0.12.5    viridis_0.6.5        viridisLite_0.4.2    cluster_2.1.6        DoubletFinder_2.0.4  colorBlindness_0.1.9 RColorBrewer_1.1-3   lubridate_1.9.3      forcats_1.0.0        stringr_1.5.1        purrr_1.0.2          readr_2.1.5          tidyr_1.3.1          tibble_3.2.1         ggplot2_3.5.1        tidyverse_2.0.0      Seurat_5.0.3         SeuratObject_5.0.2   sp_2.1-4             dplyr_1.1.4         

loaded via a namespace (and not attached):
  [1] rstudioapi_0.16.0      jsonlite_1.8.8         magrittr_2.0.3         spatstat.utils_3.0-4   farver_2.1.2           rmarkdown_2.27         vctrs_0.6.5            ROCR_1.0-11            spatstat.explore_3.2-7 htmltools_0.5.8.1      gridGraphics_0.5-1     sctransform_0.4.1      parallelly_1.37.1      KernSmooth_2.23-22     htmlwidgets_1.6.4      ica_1.0-3              plyr_1.8.9             plotly_4.10.4          zoo_1.8-12             igraph_2.0.3           mime_0.12              lifecycle_1.0.4        pkgconfig_2.0.3        Matrix_1.6-5           R6_2.5.1               fastmap_1.2.0          fitdistrplus_1.1-11    future_1.33.2          shiny_1.8.1.1          digest_0.6.35          colorspace_2.1-0       patchwork_1.2.0        tensor_1.5             RSpectra_0.16-1        irlba_2.3.5.1          vegan_2.6-6            labeling_0.4.3         progressr_0.14.0       fansi_1.0.6            spatstat.sparse_3.0-3  timechange_0.3.0       mgcv_1.9-1             httr_1.4.7             polyclip_1.10-6        abind_1.4-5            compiler_4.3.2         withr_3.0.0            fastDummies_1.7.3      MASS_7.3-60.0.1        permute_0.9-7          tools_4.3.2           
 [52] lmtest_0.9-40          httpuv_1.6.15          future.apply_1.11.2    goftest_1.2-3          glue_1.7.0             nlme_3.1-164           promises_1.3.0         grid_4.3.2             Rtsne_0.17             reshape2_1.4.4         generics_0.1.3         gtable_0.3.5           spatstat.data_3.0-4    tzdb_0.4.0             data.table_1.15.4      hms_1.1.3              utf8_1.2.4             spatstat.geom_3.2-9    RcppAnnoy_0.0.22       ggrepel_0.9.5          RANN_2.6.1             pillar_1.9.0           spam_2.10-0            RcppHNSW_0.6.0         later_1.3.2            splines_4.3.2          lattice_0.22-6         survival_3.6-4         deldir_2.0-4           tidyselect_1.2.1       miniUI_0.1.1.1         pbapply_1.7-2          knitr_1.45             gridExtra_2.3          scattermore_1.2        xfun_0.44              matrixStats_1.3.0      stringi_1.8.4          lazyeval_0.2.2         yaml_2.3.8             evaluate_0.23          codetools_0.2-20       cli_3.6.2              uwot_0.1.16            xtable_1.8-4           reticulate_1.36.1      munsell_0.5.1          Rcpp_1.0.12            globals_0.16.3         spatstat.random_3.2-3  png_0.1-8             
[103] parallel_4.3.2         dotCall64_1.1-1        listenv_0.9.1          scales_1.3.0           ggridges_0.5.6         leiden_0.4.3.1         rlang_1.1.3            cowplot_1.1.3